Hierarchical Segmentation of Falsely Touching Characters from Camera Captured Degraded Document Images

Authors

  • Satadal Saha
  • Subhadip Basu
Abstract

This research communication reports a hierarchical image segmentation scheme. Rather than dividing the image into static, spatially fixed sub-images, the scheme builds an object-level hierarchy to segment gray-scale or color images into their constituent components or sub-parts. For example, a gray-scale document image may be segmented (binarized, in the case of two-level segmentation) into connected foreground components (text or graphics) and a background component by hierarchically applying a gray-level threshold selection algorithm in the object space. At each level of the hierarchy, constituent objects are identified as connected sets of foreground pixels, as classified by the gray-level threshold selection algorithm. To preserve global information, the threshold for each object at any level is estimated as a weighted aggregate of the current and previous thresholds relevant to that object. The technique may be customized as a general-purpose hierarchical information clustering algorithm for domains such as pattern analysis, data mining, and bioinformatics.
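The object-level hierarchy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the threshold selection algorithm, the weights, or the connectivity rule, so the sketch assumes Otsu's method, an equal weight of 0.5 between the current and previous thresholds, 4-connectivity, and dark foreground on a light background. The function names `otsu_threshold`, `connected_components`, and `hierarchical_segment` are hypothetical.

```python
from collections import deque


def otsu_threshold(values):
    """Gray level t (0..255) maximizing between-class variance of {<= t} vs {> t}."""
    hist = [0] * 256
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, no valid split at this level
        var = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def connected_components(pixels):
    """Group a set of (row, col) pixels into 4-connected components."""
    remaining = set(pixels)
    components = []
    while remaining:
        seed = remaining.pop()
        comp, queue = {seed}, deque([seed])
        while queue:
            r, c = queue.popleft()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    comp.add(nb)
                    queue.append(nb)
        components.append(comp)
    return components


def hierarchical_segment(image, region=None, prev_t=None,
                         weight=0.5, depth=0, max_depth=3):
    """Recursively threshold each object; return the leaf objects as pixel sets."""
    if region is None:  # the top level covers the whole image
        region = {(r, c) for r in range(len(image)) for c in range(len(image[0]))}
    values = [image[r][c] for r, c in region]
    if depth == max_depth or min(values) == max(values):
        return [region]  # depth limit reached or nothing left to split
    local_t = otsu_threshold(values)
    # Assumed aggregation rule: blend the object's own threshold with its parent's.
    t = local_t if prev_t is None else weight * local_t + (1 - weight) * prev_t
    foreground = {p for p in region if image[p[0]][p[1]] <= t}  # dark = foreground
    if not foreground or foreground == region:
        return [region]
    leaves = []
    for comp in connected_components(foreground):
        leaves.extend(hierarchical_segment(image, comp, t, weight, depth + 1, max_depth))
    return leaves


# Two dark strokes (20) joined by a faint bridge pixel (100) on light paper (220).
image = [
    [220, 220, 220, 220, 220, 220, 220],
    [220, 20, 20, 220, 20, 20, 220],
    [220, 20, 20, 100, 20, 20, 220],
    [220, 20, 20, 220, 20, 20, 220],
    [220, 220, 220, 220, 220, 220, 220],
]
leaves = hierarchical_segment(image)
print(sorted(len(obj) for obj in leaves))  # prints [6, 6]
```

In this toy image the first-level threshold keeps the bridge pixel, so the two strokes form one falsely touching object. The second level, applied only inside that object, drops the bridge and yields two separate components.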


Similar resources

Segmentation of Camera Captured Business Card Images for Mobile Devices

Due to the large deformation in camera captured images, the variety of business cards, and the computational constraints of mobile devices, designing an efficient Business Card Reader (BCR) is challenging for researchers. Extracting text regions and segmenting them into characters is one such challenge. In this paper, we have presented an efficient character segmentation t...


A Study of Touching Characters in Degraded Gurmukhi Text

Character segmentation is an important preprocessing step for text recognition. In degraded documents, the existence of touching characters drastically decreases the recognition rate of any optical character recognition (OCR) system. In this paper a study of touching Gurmukhi characters is carried out and these characters have been divided into various categories after a careful analysis. Structural ...


On Segmentation of Touching Characters and Overlapping Lines in Degraded Printed Gurmukhi Script

Character segmentation plays a very important role in a text recognition system. The simple technique of using inter-character gap for segmentation is useful for fine printed documents, but this technique fails to give satisfactory results if the input text contains touching characters. In this paper, we have proposed two algorithms to segment touching characters, and one algorithm to segment o...


The Restoration of Camera Documents Through Image Segmentation

This paper presents a document restoration technique that is able to flatten curled document images captured through a digital camera. The proposed method corrects camera images of documents through image partition, which divides distorted text lines into multiple small patches based on the identified vertical stroke boundary (VSB) and the fitted x-line and baseline of text lines. Target rectan...


Document Image Dewarping Based on Text Line Detection and Surface Modeling (RESEARCH NOTE)

Document images produced by a scanner or digital camera usually suffer from geometric and photometric distortions. Both of them deteriorate the performance of OCR systems. In this paper, we present a novel method to compensate for undesirable geometric distortions aiming to improve OCR results. Our methodology is based on finding text lines by dynamic local connectivity map and then applying a l...




Publication date: 2011